bears %>%
  count(month) %>%
  ggplot(aes(x = month, y = n)) +
  geom_point() +
  geom_line()

Practice with ggplot2
Where should I put the aes() bit?
If you put it at the “top level” inside ggplot(aes(...)), the mapping applies to all layers. For example:
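Here is a minimal sketch of the top-level form. The `monthly` data frame is a hypothetical stand-in for the output of count(month); its values are invented for illustration and are not the bears data.

```r
library(dplyr)
library(ggplot2)

# Hypothetical stand-in for `bears %>% count(month)`; the counts are made up
monthly <- data.frame(
  month = 1:6,
  n     = c(2, 1, 4, 7, 9, 12)
)

# aes() sits in ggplot(), so both layers inherit x = month and y = n
p <- monthly %>%
  ggplot(aes(x = month, y = n)) +
  geom_point() +
  geom_line()

p
```

Neither geom_point() nor geom_line() needs its own aes() here because both pick up the mapping from ggplot().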
In contrast, if you put the aes() mapping inside a single geometry layer, it will only apply to that layer. For example, this will cause an error since the geom_line() part doesn’t have an aesthetic mapping:
bears %>%
  count(month) %>%
  ggplot() +
  geom_point(aes(x = month, y = n)) +
  geom_line()
#> Error in `geom_line()`:
#> ! Problem while setting up geom.
#> ℹ Error occurred in the 2nd layer.
#> Caused by error in `compute_geom_1()`:
#> ! `geom_line()` requires the following missing aesthetics: x and y
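One way to fix the error, sketched with a hypothetical `monthly` data frame (invented values standing in for the counted bears data), is to give every layer its own mapping. Moving the mapping back up into ggplot() would work just as well.

```r
library(dplyr)
library(ggplot2)

# Hypothetical stand-in for `bears %>% count(month)`; the counts are made up
monthly <- data.frame(month = 1:6, n = c(2, 1, 4, 7, 9, 12))

# Each layer carries its own aes(), so geom_line() now has x and y
p_layers <- monthly %>%
  ggplot() +
  geom_point(aes(x = month, y = n)) +
  geom_line(aes(x = month, y = n))

p_layers
```

Repeating the mapping is more typing, but it lets each layer use a different mapping when you need that.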
Main geoms
geom_point()
Basic scatterplot:
mpg %>%
  ggplot() +
  geom_point(aes(x = displ, y = hwy))

Change color for all points:
mpg %>%
  ggplot() +
  geom_point(aes(x = displ, y = hwy), color = 'blue')

To change color based on a variable, map the variable to color in aes():
mpg %>%
  ggplot() +
  geom_point(aes(x = displ, y = hwy, color = class))

Map the shape instead of color (usually not a great idea):
mpg %>%
  ggplot() +
  geom_point(aes(x = displ, y = hwy, shape = class))

What happened to SUV?
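To answer the question above: ggplot2's default shape palette only covers six categories, and class has seven, so the last one (suv) gets no shape and its points are dropped with a warning. If you really do want shapes, one option is to supply them yourself with scale_shape_manual(). This is only a sketch; the shape codes 0 to 6 are an arbitrary choice for illustration.

```r
library(dplyr)
library(ggplot2)

# class has 7 distinct values, one more than the default shape palette covers
n_classes <- length(unique(mpg$class))

# Supply one shape code per class (codes 0 to 6 chosen arbitrarily)
p <- mpg %>%
  ggplot() +
  geom_point(aes(x = displ, y = hwy, shape = class)) +
  scale_shape_manual(values = 0:6)

p
```

Even with all seven shapes showing, the plot is hard to read, which is why color or facets are usually the better choice.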
geom_line() vs. geom_smooth()
geom_line() connects all the dots:
mpg %>%
  ggplot() +
  geom_line(aes(x = displ, y = hwy))

This looks messy because geom_line() literally connects every point in order of its x value, from left to right, even when several points share the same x.
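One way to get a cleaner line, shown here as a sketch rather than a recommendation, is to collapse the data to a single value per x first, for example the mean hwy for each displ. The column name mean_hwy is just an illustrative choice.

```r
library(dplyr)
library(ggplot2)

# One row per displ value, so geom_line() has a single y to connect at each x
hwy_by_displ <- mpg %>%
  group_by(displ) %>%
  summarise(mean_hwy = mean(hwy))

p <- hwy_by_displ %>%
  ggplot() +
  geom_line(aes(x = displ, y = mean_hwy))

p
```

The line still zigzags, but it no longer doubles back vertically at each engine size.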
If you wanted a single “best-fit” trend line, use geom_smooth():
mpg %>%
  ggplot() +
  geom_smooth(aes(x = displ, y = hwy))

Set se = FALSE to drop the shaded confidence band:
mpg %>%
  ggplot() +
  geom_smooth(aes(x = displ, y = hwy), se = FALSE)

geom_col()
For these examples, I’m creating a smaller summary data frame first that just counts how many rows there are for each class:
mpg %>%
  count(class)
#> # A tibble: 7 × 2
#>   class          n
#>   <chr>      <int>
#> 1 2seater        5
#> 2 compact       47
#> 3 midsize       41
#> 4 minivan       11
#> 5 pickup        33
#> 6 subcompact    35
#> 7 suv           62
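If count() is new to you, it is shorthand for grouping and then counting rows. A rough equivalent, as a sketch (the object names by_count and by_summarise are only for illustration):

```r
library(dplyr)
library(ggplot2)  # provides the mpg data set

# The shortcut used in these notes
by_count <- mpg %>%
  count(class)

# The longer form: group the rows, then count the rows in each group
by_summarise <- mpg %>%
  group_by(class) %>%
  summarise(n = n())

# Both give one row per class with the same counts
identical(by_count$n, by_summarise$n)
```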
Basic bar plot of the counts:
mpg %>%
  count(class) %>%
  ggplot() +
  geom_col(aes(x = class, y = n), width = 0.7) # width sets the bar width

Re-order bars based on count using reorder():
mpg %>%
  count(class) %>%
  ggplot() +
  geom_col(aes(x = reorder(class, n), y = n), width = 0.7)

To change the color for all bars, use fill (not color):
mpg %>%
  count(class) %>%
  ggplot() +
  geom_col(aes(x = reorder(class, n), y = n), fill = 'blue', width = 0.7)

To change color based on a variable, map the variable to fill in aes():
mpg %>%
  count(class, drv) %>% # Note I had to include drv in the count too
  ggplot() +
  geom_col(aes(x = reorder(class, n), y = n, fill = drv), width = 0.7)

Use position = 'dodge' to change from stacked to side-by-side:
mpg %>%
  count(class, drv) %>% # Note I had to include drv in the count too
  ggplot() +
  geom_col(
    aes(x = reorder(class, n), y = n, fill = drv),
    position = "dodge", width = 0.7)

Practice
mpg %>%
  ggplot() +
  geom_smooth(aes(x = displ, y = hwy, color = drv))

mpg %>%
  count(class, drv) %>%
  ggplot() +
  geom_col(aes(x = drv, y = n, fill = class), width = 0.7)

mpg %>%
  ggplot(aes(x = displ, y = hwy)) +
  geom_point(aes(color = class)) +
  geom_smooth(se = FALSE)

Facets
Facets make multiple small charts and are useful when you have many levels in a categorical variable.
For example, this plot has too many color categories for the color to be useful:
mpg %>%
  ggplot(aes(x = displ, y = hwy)) +
  geom_point(aes(color = class))

Instead, we can use facet_wrap() to draw a separate small chart for each vehicle class:
mpg %>%
  ggplot(aes(x = displ, y = hwy)) +
  geom_point() +
  facet_wrap(~class)

You can also use facet_grid() to facet by two variables, with one as rows and the other as columns:
mpg %>%
  ggplot(aes(x = displ, y = hwy)) +
  geom_point() +
  facet_grid(drv ~ cyl)

Extra Practice
bears %>%
  count(year, gender) %>%
  ggplot() +
  geom_col(aes(x = year, y = n, fill = gender)) +
  labs(
    x = "Year",
    y = 'Number of killings',
    fill = "Victim gender",
    title = "Annual deadly bear attacks over time"
  ) +
  theme_bw()

mpg %>%
  mutate(manufacturer = str_to_title(manufacturer)) %>%
  group_by(manufacturer) %>%
  summarise(mean_hwy = mean(hwy)) %>%
  ggplot() +
  geom_col(aes(x = mean_hwy, y = reorder(manufacturer, mean_hwy)), width = 0.9) +
  labs(
    x = 'Highway fuel economy (mpg)',
    y = 'Vehicle manufacturer',
    title = 'Mean fuel economy by automaker'
  ) +
  theme_minimal()